Higher-Order Total Variation Classes on Grids: Minimax Theory and Trend Filtering Methods
Authors
Abstract
We consider the problem of estimating the values of a function over n nodes of a d-dimensional grid graph (having equal side lengths n^{1/d}) from noisy observations. The function is assumed to be smooth, but is allowed to exhibit different amounts of smoothness at different regions in the grid. Such heterogeneity eludes classical measures of smoothness from nonparametric statistics, such as Hölder smoothness. Meanwhile, total variation (TV) smoothness classes allow for heterogeneity, but are restrictive in another sense: only constant functions count as perfectly smooth (achieve zero TV). To move past this, we define two new higher-order TV classes, based on two ways of compiling the discrete derivatives of a parameter across the nodes. We relate these two new classes to Hölder classes, and derive lower bounds on their minimax errors. We also analyze two naturally associated trend filtering methods; when d = 2, each is seen to be rate optimal over the appropriate class.
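As a concrete (if simplified) illustration of what compiling discrete derivatives across the nodes can look like, the NumPy sketch below forms a higher-order penalty on a square 2D grid by applying a 1D difference operator of a chosen order along each axis and summing absolute values. This is my own sketch, not code from the paper, and the paper's operators may differ in scaling and in how mixed derivatives are handled; the point is only that a linear ramp, which has nonzero first-order TV, is perfectly smooth under the second-order penalty.

```python
import numpy as np

def diff_op(n, order):
    """Discrete difference operator of the given order for a length-n signal."""
    return np.diff(np.eye(n), n=order, axis=0)

def axiswise_higher_order_tv(theta, order=2):
    """Sum of absolute order-th differences taken along each axis of a square 2D grid.

    Illustrative only: the paper's higher-order TV operators may differ
    (e.g., in scaling, or in how mixed derivatives are treated).
    """
    D = diff_op(theta.shape[0], order)
    return np.abs(D @ theta).sum() + np.abs(theta @ D.T).sum()

x = np.linspace(0, 1, 50)
ramp = np.add.outer(x, x)                        # linear ramp over the grid
print(axiswise_higher_order_tv(ramp, order=1))   # large: nonzero first-order TV
print(axiswise_higher_order_tv(ramp, order=2))   # ~0: perfectly smooth at second order
```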
Related papers
Supplement to “Higher-Order Total Variation Classes on Grids: Minimax Theory and Trend Filtering Methods”
the Kronecker sum of D⊤D with itself, a total of d times. Using a standard fact about Kronecker sums, if ρ_1, . . . , ρ_N denote the eigenvalues of D⊤D, then ρ_{i_1} + ρ_{i_2} + · · · + ρ_{i_d}, for i_1, . . . , i_d ∈ {1, . . . , N}, are the eigenvalues of ∆̃⊤∆̃. By counting the multiplicity of the zero eigenvalue, we arrive at a nullity for ∆̃ of (k + 1)^d. One can now directly check that each of the polynomials specif...
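The Kronecker-sum eigenvalue fact quoted here is easy to check numerically. The sketch below is my own illustration of the d = 2 case, assuming D is the (k + 1)st-order 1D difference operator (the supplement's exact operators may differ): the eigenvalues of L ⊕ L = L ⊗ I + I ⊗ L are exactly the pairwise sums of the eigenvalues of L = D⊤D, and counting zero eigenvalues recovers the nullity.

```python
import numpy as np

N, k = 8, 1
D = np.diff(np.eye(N), n=k + 1, axis=0)    # assumed (k+1)st-order 1D difference operator
L = D.T @ D                                # nullity k + 1 (degree-k polynomials)

# Kronecker sum of L with itself (the d = 2 case)
kron_sum = np.kron(L, np.eye(N)) + np.kron(np.eye(N), L)

rho = np.linalg.eigvalsh(L)
pairwise = np.sort(np.add.outer(rho, rho).ravel())    # all sums rho_i + rho_j
assert np.allclose(np.sort(np.linalg.eigvalsh(kron_sum)), pairwise)

# Zero-eigenvalue counts: k + 1 for L, and (k + 1)**2 for the d = 2 Kronecker sum
print(np.sum(rho < 1e-8), np.sum(np.linalg.eigvalsh(kron_sum) < 1e-8))
```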
Adaptive Piecewise Polynomial Estimation via Trend Filtering
We study trend filtering, a recently proposed tool of Kim et al. (2009) for nonparametric regression. The trend filtering estimate is defined as the minimizer of a penalized least squares criterion, in which the penalty term sums the absolute kth order discrete derivatives over the input points. Perhaps not surprisingly, trend filtering estimates appear to have the structure of kth degree splin...
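Written as code, the criterion described above is just a least squares loss plus an ℓ_1 penalty on discrete differences of the fitted values. The sketch below is mine, not the paper's: it assumes evenly spaced inputs (trend filtering on arbitrary inputs uses a weighted difference operator), uses the generic convex solver cvxpy rather than any specialized algorithm, and adopts the convention of a difference operator whose null space is the degree-k polynomials.

```python
import numpy as np
import cvxpy as cp

def trend_filter(y, k=1, lam=2.0):
    """Trend filtering on evenly spaced inputs: least squares loss plus an
    l1 penalty on discrete differences of the fitted values."""
    n = len(y)
    # difference operator whose null space is degree-k polynomials
    D = np.diff(np.eye(n), n=k + 1, axis=0)
    beta = cp.Variable(n)
    obj = 0.5 * cp.sum_squares(y - beta) + lam * cp.norm1(D @ beta)
    cp.Problem(cp.Minimize(obj)).solve()
    return beta.value

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 100)
y = np.abs(x - 0.5) + 0.05 * rng.standard_normal(100)   # piecewise linear signal + noise
fit = trend_filter(y, k=1, lam=2.0)                      # roughly piecewise linear estimate
```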
Total Variation Classes Beyond 1d: Minimax Rates, and the Limitations of Linear Smoothers
We consider the problem of estimating a function defined over n locations on a d-dimensional grid (having all side lengths equal to n^{1/d}). When the function is constrained to have discrete total variation bounded by C_n, we derive the minimax optimal (squared) ℓ_2 estimation error rate, parametrized by n and C_n. Total variation denoising, also known as the fused lasso, is seen to be rate optimal. Severa...
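For reference, the (first-order) discrete total variation over a 2D grid graph is just the sum of absolute differences across neighboring nodes. A short sketch of my own, not the paper's code:

```python
import numpy as np

def grid_tv(theta):
    """First-order discrete TV of theta over a 2D grid graph: sum of absolute
    differences across vertically and horizontally adjacent nodes."""
    return (np.abs(np.diff(theta, axis=0)).sum()
            + np.abs(np.diff(theta, axis=1)).sum())

# A function that is flat except for one sharp bump has small grid TV,
# illustrating the heterogeneous smoothness that TV classes allow.
theta = np.zeros((50, 50))
theta[20:25, 20:25] = 1.0
print(grid_tv(theta))   # = 20, the perimeter of the 5x5 block
```

TV denoising (the fused lasso on this grid graph) then minimizes 0.5‖y − θ‖² + λ · grid_tv(θ) over θ.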
Additive Models with Trend Filtering
We consider additive models built with trend filtering, i.e., additive models whose components are each regularized by the (discrete) total variation of their kth (discrete) derivative, for a chosen integer k ≥ 0. This results in kth degree piecewise polynomial components (e.g., k = 0 gives piecewise constant components, k = 1 gives piecewise linear, k = 2 gives piecewise quadratic, etc.)...
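To make the additive structure concrete, here is an illustrative sketch of my own (not the paper's algorithm): each component is a vector living on the sorted values of its feature, penalized by trend filtering, and the whole model is fit jointly with the generic solver cvxpy. The function name, the centering constraints, and the single shared tuning parameter are all my choices for this sketch.

```python
import numpy as np
import cvxpy as cp

def additive_trend_filter(X, y, k=1, lam=1.0):
    n, p = X.shape
    D = np.diff(np.eye(n), n=k + 1, axis=0)      # trend filtering difference operator
    thetas = [cp.Variable(n) for _ in range(p)]  # component j, indexed by sorted feature j
    mu = cp.Variable()                           # intercept
    fit, penalty = mu, 0
    for j in range(p):
        P = np.eye(n)[np.argsort(X[:, j])]       # sorts observations by feature j
        fit = fit + P.T @ thetas[j]              # map component back to observation order
        penalty = penalty + cp.norm1(D @ thetas[j])
    constraints = [cp.sum(t) == 0 for t in thetas]   # center components for identifiability
    obj = 0.5 * cp.sum_squares(y - fit) + lam * penalty
    cp.Problem(cp.Minimize(obj), constraints).solve()
    return mu.value, [t.value for t in thetas]
```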
Fast and Flexible ADMM Algorithms for Trend Filtering
This paper presents a fast and robust algorithm for trend filtering, a recently developed nonparametric regression tool. It has been shown that, for estimating functions whose derivatives are of bounded variation, trend filtering achieves the minimax optimal error rate, while other popular methods like smoothing splines and kernels do not. Standing in the way of a more widespread practical adop...
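As a baseline for comparison, the sketch below shows the standard "textbook" ADMM for the trend filtering problem, obtained from the splitting z = Dβ with a soft-thresholding z-update. It is my own illustration of the problem setup, not the specialized ADMM developed in the paper, and the step size and iteration count are arbitrary choices.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def soft_threshold(a, t):
    """Elementwise soft-thresholding operator."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def trend_filter_admm(y, k=1, lam=5.0, rho=None, n_iter=500):
    """Standard ADMM for min 0.5*||y - beta||^2 + lam*||D beta||_1, splitting z = D beta."""
    n = len(y)
    rho = lam if rho is None else rho
    D = np.diff(np.eye(n), n=k + 1, axis=0)
    beta, z = y.copy(), D @ y
    u = np.zeros(len(z))
    chol = cho_factor(np.eye(n) + rho * (D.T @ D))        # factor the beta-update system once
    for _ in range(n_iter):
        beta = cho_solve(chol, y + rho * D.T @ (z - u))   # quadratic beta-update
        z = soft_threshold(D @ beta + u, lam / rho)       # l1 prox (z-update)
        u = u + D @ beta - z                              # scaled dual update
    return beta

# Usage: denoise a noisy piecewise linear signal with k = 1
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 200)
y = np.minimum(x, 0.6 - x / 2) + 0.05 * rng.standard_normal(200)
beta_hat = trend_filter_admm(y, k=1, lam=5.0)
```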